Causal Inference 3 Tutorial
Confounders
- A confounder is a variable that affects both the probability of receiving the treatment and the value of the outcome
- When there are confounders, we can't tell how much of the difference between the treatment and control groups is caused by the treatment and how much is caused by something correlated with the probability of receiving the treatment
- To account for confounders, we can control for them by including them as covariates in our model
set.seed(1234568)
## Confounders: generating data according to a confounder structure
n = 1000 # sample size
# Generate a treatment variable
x1 = rnorm(n, mean = 1, sd = 0.1)
# look at the data
library(ggplot2)
ggplot() +
  geom_histogram(aes(x = x1), bins = 30, fill = "skyblue", color = "black") +
  # dashed red line at the sample mean (linewidth replaces size for lines in ggplot2 >= 3.4.0)
  geom_vline(xintercept = mean(x1), linetype = "dashed", color = "red", linewidth = 1) +
  labs(x = "x1", y = "Frequency", title = "Histogram of x1") +
  annotate("text", x = mean(x1) + 0.04, y = 100, label = paste0("Mean = ", round(mean(x1), digits = 0)), color = "red", size = 4)
# Generate a confounder
z = rnorm(n, mean = 3, sd = 0.3)
# Generate an outcome variable
y = 2*x1 + 3*z + rnorm(n, mean = 0, sd = 1)
# create a table showing estimates of x1 with and without controlling for z
modelsummary::modelsummary(
  list(lm(y ~ x1), lm(y ~ x1 + z)),
  estimate = "{estimate}{stars} ({std.error})",
  statistic = NULL,
  gof_omit = 'IC|RMSE|Log|F|R2$|Std.')

| | Model 1 | Model 2 |
|---|---|---|
| (Intercept) | 8.739*** (0.406) | 0.880* (0.429) |
| x1 | 2.294*** (0.404) | 1.915*** (0.309) |
| z | | 2.746*** (0.103) |
| Num.Obs. | 1000 | 1000 |
| R2 Adj. | 0.030 | 0.431 |
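In the simulation above, z is drawn independently of x1, so the x1 estimate stays close to the true value of 2 whether or not z is in the model. The following is a minimal sketch of a variant in which z also shifts the treatment, so that omitting it biases the x1 estimate. The seed, the 0.5 strength of the z-to-x1 link, and the names `naive` and `adjusted` are assumptions made for illustration; they are not part of the tutorial.

```r
set.seed(42)  # arbitrary seed, chosen only for this sketch
n <- 1000

# z is a true confounder here: it shifts the treatment x1 AND the outcome y
z  <- rnorm(n, mean = 3, sd = 0.3)
x1 <- 1 + 0.5 * (z - 3) + rnorm(n, mean = 0, sd = 0.1)  # 0.5 is an assumed strength
y  <- 2 * x1 + 3 * z + rnorm(n, mean = 0, sd = 1)       # same outcome equation as above

naive    <- coef(lm(y ~ x1))["x1"]      # omits z, so x1 absorbs part of z's effect
adjusted <- coef(lm(y ~ x1 + z))["x1"]  # controls for z

print(c(naive = unname(naive), adjusted = unname(adjusted)))
```

Under these assumptions the naive estimate should land well above 2, while the adjusted estimate should sit near 2, though with a wider standard error because x1 and z are now correlated.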
Colliders
[1] 0.05157384
| | Model 1 | Model 2 |
|---|---|---|
| (Intercept) | 2.948*** (0.033) | 0.570*** (0.041) |
| x1 | 0.053 (0.032) | −0.761*** (0.020) |
| x2 | | 0.399*** (0.006) |
| Num.Obs. | 1000 | 1000 |
| R2 Adj. | 0.002 | 0.797 |
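The code that produced the collider table is not shown in this section. As a rough illustration of the pattern in the table (no x1 effect without the collider, a spurious negative effect once it is controlled for), here is a minimal sketch of a collider structure. The distributions, the coefficients of 2, the seed, and the model names `m_without` and `m_with` are all assumptions; this is not the tutorial's original code.

```r
set.seed(101)  # arbitrary seed, chosen only for this sketch
n <- 1000

# x1 and y are generated independently: x1 has NO effect on y
x1 <- rnorm(n, mean = 0, sd = 1)
y  <- rnorm(n, mean = 3, sd = 1)

# x2 is a collider: it is caused by BOTH x1 and y (coefficients are assumed)
x2 <- 2 * x1 + 2 * y + rnorm(n, mean = 0, sd = 1)

m_without <- lm(y ~ x1)       # correct model: x1 coefficient should be near 0
m_with    <- lm(y ~ x1 + x2)  # conditioning on the collider opens a spurious path

print(coef(m_without)["x1"])
print(coef(m_with)["x1"])
```

Conditioning on x2 makes x1 and y appear negatively related: for a fixed value of x2, a larger x1 must be offset by a smaller y.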
Post-Treatment Mechanism Bias
set.seed(2233)
## Post-treatment mechanism: generating data in which the treatment works partly through a mechanism
n = 1000 # sample size
# Generate a treatment variable
x1 = rbinom(n, 1, 0.5)
# Generate a mechanism
mechanism = rnorm(n, mean = x1 * 2, sd = 1)
# Generate an outcome variable
y = 0.5 * x1 + .7 * mechanism + rnorm(n)
# create a table showing estimates of x1 with and without controlling for the mechanism
modelsummary::modelsummary(
  list(lm(y ~ x1), lm(y ~ x1 + mechanism)),
  estimate = "{estimate}{stars} ({std.error})",
  statistic = NULL,
  gof_omit = 'IC|RMSE|Log|F|R2$|Std.')

| | Model 1 | Model 2 |
|---|---|---|
| (Intercept) | −0.033 (0.054) | −0.025 (0.043) |
| x1 | 1.956*** (0.075) | 0.517*** (0.087) |
| mechanism | | 0.704*** (0.030) |
| Num.Obs. | 1000 | 1000 |
| R2 Adj. | 0.402 | 0.612 |
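In the data-generating process above, the true total effect of x1 on y is 0.5 + 0.7 × 2 = 1.9: a direct effect of 0.5 plus an indirect effect that runs through the mechanism. Model 1 estimates roughly that total, while Model 2 recovers only the direct part, because controlling for the mechanism blocks the indirect path. The sketch below repeats the simulation and splits the total estimate into its direct and indirect pieces. The helper names (`total`, `direct`, `indirect`, `a_mech`, `b_mech`) are introduced here for illustration only.

```r
set.seed(2233)
n <- 1000
x1 <- rbinom(n, 1, 0.5)                        # binary treatment
mechanism <- rnorm(n, mean = x1 * 2, sd = 1)   # treatment shifts the mechanism by 2
y <- 0.5 * x1 + 0.7 * mechanism + rnorm(n)     # direct effect 0.5, mechanism effect 0.7

# Total effect: regression that leaves the mechanism out
total <- unname(coef(lm(y ~ x1))["x1"])

# Direct effect and mechanism effect: regression that controls for the mechanism
fit_full <- lm(y ~ x1 + mechanism)
direct   <- unname(coef(fit_full)["x1"])
b_mech   <- unname(coef(fit_full)["mechanism"])

# Effect of the treatment on the mechanism
a_mech <- unname(coef(lm(mechanism ~ x1))["x1"])

# Indirect effect: the part of the treatment effect that flows through the mechanism
indirect <- a_mech * b_mech

print(c(total = total, direct = direct, indirect = indirect))
```

With ordinary least squares the decomposition total = direct + indirect holds exactly in the sample, so the gap between the two x1 estimates in the table is the estimated indirect effect. If the quantity of interest is the total effect of the treatment, the mechanism should be left out of the model.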